"I Won the Election!": An Empirical Analysis of Soft Moderation Interventions on Twitter
Over the past few years, there has been heated debate and serious public
concern regarding online content moderation, censorship, and the principle of free
speech on the Web. To ease these concerns, social media platforms like Twitter
and Facebook refined their content moderation systems to support soft
moderation interventions. Soft moderation interventions refer to warning labels
attached to potentially questionable or harmful content to inform other users
about the content and its nature while the content remains accessible, hence
alleviating concerns related to censorship and free speech. In this work, we
perform one of the first empirical studies on soft moderation interventions on
Twitter. Using a mixed-methods approach, we study the users who share tweets
with warning labels on Twitter and their political leaning, the engagement that
these tweets receive, and how users interact with tweets that have warning
labels. Among other things, we find that 72% of the tweets with warning labels
are shared by Republicans, while only 11% are shared by Democrats. By analyzing
content engagement, we find that tweets with warning labels had more engagement
than tweets without warning labels. Also, we qualitatively analyze how
users interact with content that has warning labels, finding that the most
popular interactions are related to further debunking false claims, mocking the
author or content of the disputed tweet, and further reinforcing or resharing
false claims. Finally, we describe concrete examples of inconsistencies, such
as warning labels that are incorrectly added or warning labels that are not
added on tweets despite sharing questionable and potentially harmful
information.
Comment: Accepted at the 15th AAAI Conference on Web and Social Media (ICWSM 2021).
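The engagement comparison described above lends itself to a simple statistical check. Below is a minimal sketch, not the authors' actual pipeline; the CSV input and column names ("has_label", "retweets") are hypothetical stand-ins:

    # Hypothetical sketch: compare engagement of tweets with and without
    # warning labels. The schema is assumed, not taken from the paper.
    import pandas as pd
    from scipy.stats import mannwhitneyu

    tweets = pd.read_csv("tweets.csv")  # assumed columns: has_label (bool), retweets (int)
    labeled = tweets.loc[tweets["has_label"], "retweets"]
    unlabeled = tweets.loc[~tweets["has_label"], "retweets"]

    # Engagement counts are heavily skewed, so a non-parametric test is a
    # reasonable choice for testing "labeled tweets receive more engagement".
    stat, p = mannwhitneyu(labeled, unlabeled, alternative="greater")
    print(f"median labeled={labeled.median()}, unlabeled={unlabeled.median()}, p={p:.4f}")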
A Quantitative Approach to Understanding Online Antisemitism
A new wave of growing antisemitism, driven by fringe Web communities, is an
increasingly worrying presence in the socio-political realm. The ubiquitous and
global nature of the Web has provided tools used by these groups to spread
their ideology to the rest of the Internet. Although the study of antisemitism
and hate is not new, the scale and rate of change of online data has impacted
the efficacy of traditional approaches to measure and understand these
troubling trends. In this paper, we present a large-scale, quantitative study
of online antisemitism. We collect hundreds of millions of posts and images from
alt-right Web communities like 4chan's Politically Incorrect board (/pol/) and
Gab. Using scientifically grounded methods, we quantify the escalation and
spread of antisemitic memes and rhetoric across the Web. We find the frequency
of antisemitic content greatly increases (in some cases more than doubling)
after major political events such as the 2016 US Presidential Election and the
"Unite the Right" rally in Charlottesville. We extract semantic embeddings from
our corpus of posts and demonstrate how automated techniques can discover and
categorize the use of antisemitic terminology. We additionally examine the
prevalence and spread of the antisemitic "Happy Merchant" meme, and in
particular how these fringe communities influence its propagation to more
mainstream communities like Twitter and Reddit. Taken together, our results
provide a data-driven, quantitative framework for understanding online
antisemitism. Our methods serve as a framework to augment current qualitative
efforts by anti-hate groups, providing new insights into the growth and spread
of hate online.Comment: To appear at the 14th International AAAI Conference on Web and Social
Media (ICWSM 2020). Please cite accordingl
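To make the embedding step above concrete, here is a minimal sketch of embedding-based term discovery using word2vec, one common choice; the toy corpus, seed term, and hyperparameters are illustrative assumptions, and the paper's exact setup may differ:

    from gensim.models import Word2Vec

    # Placeholder corpus; in the study this would be the tokenized /pol/ and Gab posts.
    posts = [
        ["toy", "tokenized", "post", "about", "a", "meme"],
        ["another", "toy", "post", "about", "the", "same", "meme"],
    ]

    # Hyperparameters here are generic, not the paper's.
    model = Word2Vec(sentences=posts, vector_size=100, window=5, min_count=1, workers=4)

    # Words used in similar contexts cluster together, so nearest neighbours
    # of a seed term surface related (possibly coded) vocabulary.
    for term, score in model.wv.most_similar("meme", topn=5):
        print(f"{term}\t{score:.3f}")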
Reading In-Between the Lines: An Analysis of Dissenter
Efforts by content creators and social networks to enforce legal and
policy-based norms, e.g., blocking hate speech and users, have driven the rise of
unrestricted communication platforms. One such recent effort is Dissenter, a
browser and web application that provides a conversational overlay for any web
page. These conversations hide in plain sight: users of Dissenter can see and
participate in them, whereas visitors using other browsers are
oblivious to their existence. Further, website and content owners have no
power over the conversation as it resides in an overlay outside their control.
In this work, we obtain a history of Dissenter comments, users, and the
websites being discussed, from the initial release of Dissenter in Feb. 2019
through Apr. 2020 (14 months). Our corpus consists of approximately 1.68M
comments made by 101k users commenting on 588k distinct URLs. We first analyze
macro characteristics of the network, including the user-base, comment
distribution, and growth. We then use toxicity dictionaries, the Perspective API,
and a natural language processing model to understand the nature of the
comments and measure the propensity of particular websites and content to
elicit hateful and offensive Dissenter comments. Using curated rankings of
media bias, we examine the conditional probability of hateful comments given
left and right-leaning content. Finally, we study Dissenter as a social
network and identify a core group of users with high comment toxicity.
Comment: Accepted at IMC 2020.
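As one concrete piece of the toxicity pipeline described above, the Perspective API can be queried roughly as follows; the API key is a placeholder, and error handling and rate limiting are omitted:

    from googleapiclient import discovery

    API_KEY = "YOUR_API_KEY"  # placeholder; issued via Google Cloud

    client = discovery.build(
        "commentanalyzer",
        "v1alpha1",
        developerKey=API_KEY,
        discoveryServiceUrl="https://commentanalyzer.googleapis.com/$discovery/rest?version=v1alpha1",
        static_discovery=False,
    )

    def toxicity(comment: str) -> float:
        """Return Perspective's TOXICITY score in [0, 1] for one comment."""
        body = {
            "comment": {"text": comment},
            "requestedAttributes": {"TOXICITY": {}},
        }
        response = client.comments().analyze(body=body).execute()
        return response["attributeScores"]["TOXICITY"]["summaryScore"]["value"]

    # The conditional probabilities the abstract mentions then reduce to
    # simple ratios after thresholding the scores, e.g.
    # P(toxic | left-leaning page) = toxic comments on such pages / all comments on such pages.
    print(toxicity("example Dissenter comment"))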
Before Blue Birds Became X-tinct: Understanding the Effect of Regime Change on Twitter's Advertising and Compliance of Advertising Policies
Social media platforms, including Twitter (now X), have policies in place to
maintain a safe and trustworthy advertising environment. However, the extent to
which these policies are adhered to and enforced remains a subject of interest
and concern. We present the first large-scale audit of advertising on Twitter
focusing on compliance with the platform's advertising policies, particularly
those related to political and adult content. We investigate the compliance of
advertisements on Twitter with the platform's stated policies and the impact of
the platform's recent acquisition on its advertising activity. By analyzing
34K advertisements from ~6M tweets, collected over six months, we find evidence
of widespread noncompliance with Twitter's political and adult content
advertising policies, suggesting a lack of effective ad content moderation. We
also find that Elon Musk's acquisition of Twitter had a noticeable impact on
the advertising landscape, with most existing advertisers either completely
stopping their advertising activity or reducing it. Major brands decreased
their advertising on Twitter, suggesting a negative immediate effect on the
platform's advertising revenue. Our findings underscore the importance of
external audits to monitor compliance and improve transparency in online
advertising.
The Web of False Information: Rumors, Fake News, Hoaxes, Clickbait, and Various Other Shenanigans
A new era of Information Warfare has arrived. Various actors, including
state-sponsored ones, are weaponizing information on Online Social Networks to
run false information campaigns with targeted manipulation of public opinion on
specific topics. These false information campaigns can have dire consequences
for the public: mutating their opinions and actions, especially with respect to
critical world events like major elections. Evidently, the problem of false
information on the Web is a crucial one, and needs increased public awareness,
as well as immediate attention from law enforcement agencies, public
institutions, and in particular, the research community. In this paper, we take
a step in this direction by providing a typology of the Web's false information
ecosystem, comprising various types of false information, actors, and their
motives. We report a comprehensive overview of existing research on the false
information ecosystem by identifying several lines of work: 1) how the public
perceives false information; 2) understanding the propagation of false
information; 3) detecting and containing false information on the Web; and 4)
false information on the political stage. In this work, we pay particular
attention to political false information as: 1) it can have dire consequences
for the community (e.g., when election results are mutated); and 2) previous work
shows that this type of false information propagates faster and further when
compared to other types of false information. Finally, for each of these lines
of work, we report several future research directions that can help us better
understand and mitigate the emerging problem of false information dissemination
on the Web.
Who let the trolls out? Towards understanding state-sponsored trolls
Recent evidence has emerged linking coordinated campaigns by state-sponsored actors to manipulate public opinion on the Web. Campaigns revolving around major political events are enacted via mission-focused "trolls." While trolls are involved in spreading disinformation on social media, there is little understanding of how they operate, what type of content they disseminate, how their strategies evolve over time, and how they influence the Web's information ecosystem. In this paper, we begin to address this gap by analyzing 10M posts by 5.5K Twitter and Reddit users identified as Russian and Iranian state-sponsored trolls. We compare the behavior of each group of state-sponsored trolls with a focus on how their strategies change over time, the different campaigns they embark on, and differences between the trolls operated by Russia and Iran. Among other things, we find: 1) that Russian trolls were pro-Trump while Iranian trolls were anti-Trump; 2) evidence that campaigns undertaken by such actors are influenced by real-world events; and 3) that the behavior of such actors is not consistent over time, hence detection is not straightforward. Using Hawkes Processes, we quantify the influence these accounts have on pushing URLs on four platforms: Twitter, Reddit, 4chan's Politically Incorrect board (/pol/), and Gab. In general, Russian trolls were more influential and efficient in pushing URLs to all the other platforms, with the exception of /pol/, where Iranians were more influential. Finally, we release our source code to ensure the reproducibility of our results and to encourage other researchers to work on understanding other emerging kinds of state-sponsored troll accounts on Twitter.
https://arxiv.org/pdf/1811.03130.pdf
Accepted manuscript.
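For readers unfamiliar with the model, the sketch below illustrates the multivariate Hawkes intensity with exponential kernels that underlies this kind of cross-platform influence estimation; the parameter values and event times are toy inputs, not fitted values from the paper:

    import numpy as np

    # Multivariate Hawkes process with exponential kernels. The paper fits the
    # parameters (mu, alpha) from observed URL-posting times; this only shows
    # the intensity computation itself.
    platforms = ["Twitter", "Reddit", "/pol/", "Gab"]
    K = len(platforms)

    mu = np.full(K, 0.1)           # background rates (toy values)
    alpha = np.full((K, K), 0.05)  # alpha[i, j]: excitation of platform i by events on j
    beta = 1.0                     # exponential decay rate of the kernel

    # events[j] = sorted timestamps at which a given URL appeared on platform j
    rng = np.random.default_rng(0)
    events = [np.sort(rng.uniform(0, 10, size=5)) for _ in range(K)]

    def intensity(i: int, t: float) -> float:
        """lambda_i(t) = mu_i + sum_j sum_{t_k^j < t} alpha[i,j] * beta * exp(-beta (t - t_k^j))."""
        lam = mu[i]
        for j in range(K):
            past = events[j][events[j] < t]
            lam += alpha[i, j] * beta * np.exp(-beta * (t - past)).sum()
        return lam

    for i, name in enumerate(platforms):
        print(f"{name}: lambda(t=10) = {intensity(i, 10.0):.3f}")

    # A fitted alpha[i, j], scaled by the kernel mass, gives the expected number
    # of events on platform i caused by a single event on platform j.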
The Pushshift Telegram Dataset
Messaging platforms, especially those with a mobile focus, have become
increasingly ubiquitous in society. These mobile messaging platforms can have
deceptively large user bases, and in addition to being a way for people to stay
in touch, are often used to organize social movements, as well as a place for
extremists and other ne'er-do-wells to congregate. In this paper, we present a
dataset from one such mobile messaging platform: Telegram. Our dataset is made
up of over 27.8K channels and 317M messages from 2.2M unique users. To the best
of our knowledge, our dataset is the largest and most complete of its
kind. In addition to the raw data, we also provide the source code used to
collect it, allowing researchers to run their own data collection instance. We
believe the Pushshift Telegram dataset can help researchers from a variety of
disciplines interested in studying online social movements, protests, political
extremism, and disinformation.
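As a starting point for working with such a dump, a sketch along these lines could tally messages per channel; it assumes newline-delimited JSON and a "to_id" field, which may not match the released schema exactly:

    import json
    from collections import Counter

    def messages_per_channel(path: str) -> Counter:
        """Count messages per channel in a newline-delimited JSON dump."""
        counts = Counter()
        with open(path, encoding="utf-8") as f:
            for line in f:
                msg = json.loads(line)
                # "to_id" is an assumed field name; adjust to the actual schema.
                counts[str(msg.get("to_id"))] += 1
        return counts

    for channel, n in messages_per_channel("messages.ndjson").most_common(10):
        print(channel, n)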
"And We Will Fight For Our Race!" A Measurement Study of Genetic Testing Conversations on Reddit and 4chan
Rapid progress in genomics has enabled a thriving market for “direct-to-consumer” genetic testing, whereby people have access to their genetic information without the involvement of a healthcare provider. Companies like 23andMe and AncestryDNA, which provide affordable health, genealogy, and ancestry reports, have already tested tens of millions of customers. At the same time, alas, far-right groups have also taken an interest in genetic testing, using it to attack minorities and prove their genetic “purity.” However, the relation between genetic testing and online hate has not really been studied by the scientific community. To address this gap, we present a measurement study shedding light on how genetic testing is discussed on Web communities on Reddit and 4chan. We collect 1.3M comments posted over 27 months using a set of 280 keywords related to genetic testing. We then use Latent Dirichlet Allocation, Google’s Perspective API, Perceptual Hashing, and word embeddings to identify trends, themes, and topics of discussion. Our analysis shows that genetic testing is discussed frequently on Reddit and 4chan, and often includes highly toxic language expressed through hateful, racist, and misogynistic comments. In particular, on 4chan’s Politically Incorrect board (/pol/), content from genetic testing conversations involves several alt-right personalities and openly antisemitic memes. Finally, we find that genetic testing appears in a few unexpected contexts, and that users seem to build groups ranging from technology enthusiasts to communities using it to promote fringe political views.
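To illustrate the topic-modeling step mentioned above, here is a minimal Latent Dirichlet Allocation sketch with gensim on a toy corpus; the documents, topic count, and preprocessing are stand-ins for the paper's 1.3M keyword-filtered comments:

    from gensim import corpora
    from gensim.models import LdaModel

    # Toy stand-in for the tokenized genetic-testing comments; the paper's
    # keyword filtering and preprocessing are more involved.
    docs = [
        ["ancestry", "report", "dna", "results"],
        ["health", "report", "dna", "testing"],
        ["ancestry", "heritage", "ethnicity", "results"],
    ]

    dictionary = corpora.Dictionary(docs)
    corpus = [dictionary.doc2bow(doc) for doc in docs]

    # num_topics is illustrative; a real run would tune it (e.g., via coherence).
    lda = LdaModel(corpus=corpus, id2word=dictionary, num_topics=2, passes=10, random_state=0)

    for topic_id, terms in lda.print_topics(num_words=4):
        print(topic_id, terms)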